DOCUMENT RESUME



ED 103 557 



UD 014 932



AUTHOR Franklin, Anderson J.

TITLE The Testing Dilemma for Minorities.

PUB DATE Oct 74

NOTE 15p.; Paper presented at the Public Hearings on
Statewide Testing and Evaluation, State of New York
(Albany, N.Y., October 1974)

EDRS PRICE MF-$0.76 HC-$1.58 PLUS POSTAGE

DESCRIPTORS Achievement Tests; Aptitude Tests; *Educational
Policy; *Educational Testing; Government Role;
*Minority Group Children; Public Policy; Racial
Discrimination; Social Discrimination; *State
Government; State Programs; *Test Bias; Testing
Problems; Testing Programs; Test Validity

ABSTRACT 

The document states that certain steps need to be
taken immediately for rectifying and containing the injustices of
testing. Until such time that the State can demonstrate unequivocally
that their statewide testing and evaluation program is fair to all
groups, and that every student has had an equal exposure to quality
school environments before evaluation, then there should be a
moratorium on testing. The State should establish a task force for
the development of an Office of Consumer Affairs in Testing and
Student Evaluation. The State should establish a Research and
Development Office which will have the latitude to study empirical
questions of teacher and pupil performance. It is most important that
evaluative agencies recognize that tests and their ensuing social
judgments are instruments of racism by virtue of minority exclusion
in all phases of test utilization. Moreover, since minorities have
limited access to the opportunity (mainstream) structures of this
society, much less policy making positions, it is obvious that
decisions on criterion variables (job or education) have negligible
minority input. Since racism has been an integral characteristic of
the power brokers in this country, and the testing industry caters to
the power brokers, there is no reason to assume that testing has the
best interests of minorities at heart. (Author/JH)

Introduction



Preparing a statement on statewide testing again brought home to me the
indisputable fact that this evaluative process has enormous shortcomings and abundant
complexities. However, we continue to pursue this assessment technique with a
relentlessness that has to bring our professional judgments into serious question. The
debate over testing, its biases, its fairness, its consequences, continues incessantly. Like
that rat on the activity wheel, our learned discussions on the test issue are getting us nowhere
fast. The only vivid truth about our testing practices is that individuals are having their
lives charted for better or for worse by assessment techniques which are at best still
primitive. Perhaps our serious error in judgment is the lack of control we exert over the
utilization of these instruments. There is no effective monitoring of how these tests are
employed. It is ludicrous to assume that admonitions written in test manuals will be
strictly adhered to by the user. The fact that test limitations are a major psychometric issue
is negligible for most agencies employing tests. The test user is seeking an expedient method
of evaluation and selection. Our profession has provided that solution in the form of
innumerable standardized instruments. But in supplying that demand we have neglected
to impress upon the users the deficiencies of the tool placed in their hands. In effect the
misuse and abuse of tests becomes the liability of the test professionals by default and
abdication of responsible guidance in their use. Consequently we end up in perennial
meetings like this one today trying to resolve a dilemma that seems insoluble by the
very persistence of the issue throughout the history of testing. I attribute part of the
problem to the inescapable social issues dominating the use of tests. Whenever a device
is used to determine the status of individuals, discrimination occurs that carries, for me,
just as many social implications as technical ones.



In my opinion the responsibility of this panel in considering the situation of
statewide testing encompasses the review of both the technical and social implications
of pupil evaluation. The Regents examinations, the Pupil Evaluation Program and all
other statewide tests are subject to technical and social scrutiny. Parenthetically, I
include the growing legal challenges confronting the test industry within the social
domain. We cannot overlook the stigma associated with poor performance on tests. The
consequences of negative labeling for the personal development of the individual have
been adequately articulated in the book Stigma by Goffman (1963). No responsible State
authority would be party to such a process. Given the tests employed by the State plus
the testing dilemma for minorities, I cannot in good conscience absolve the State from
participating in the unfair classification of minority students with its Statewide Testing
program. For the protection of student welfare this is an area in need of careful scrutiny.
It is one that I and other colleagues will begin to examine closely. But just as our efforts
develop in this domain, I would hope that the substance of my statement here today will
begin to move the State into more careful consideration of the consequences of its Statewide
Testing program for the development of the individual, not the economics of education or
the whims of legislative and political demands. Consequently, I would like to put several
questions before this panel; if solutions are seriously sought in response to them, we
may achieve movement towards a better educational system.

My questions are as follows:

1. In the current State testing program, what is the ultimate assessment purpose
for every instrument used? Is it for expediency or education?

2. Given the tests employed by the State, what is their actual utility?
Do they provide sufficient information on the school's role in student
preparation and intellectual development? Do they contribute to educational
planning or primarily to classification in a meritocracy?

3. What is the range and adequacy of the criterion variables in evaluating student
and school performance? Similarly, to what extent have evaluation procedures
been limited to quantitative assessment methods and criteria in contradistinction
to qualitative variables (e.g., motivation, teachers' and students')?

4. What is the interaction between school performance and student performance?
Are there comparable measures of competency in the school as there are for the
student? If not, then what is the value of judgments on the intellectual
development of the pupil?

5. If the standardization process in the development of tests (particularly item
selection) has not had sufficient minority group representation and professional
input, then on whose standard is evaluation being performed? Parenthetically,
how is the differential development of specific groups as defined by sex, age,
race, geographical location, social class, etc., considered in pupil evaluation?
Moreover, what is the State's responsibility in the fate of many children classified
as educably retarded or slow learners because of culturally biased and unfair tests
as well as inequity in the quality of school experiences?

6. Within the total scope of the statewide evaluative system, to what extent is the
ecology of learning and development for specific groups considered?

7. What is the status of fundamental research on education and assessment in the
State Education Department? Furthermore, what has the Department contributed
to our understanding of the learning process in students?

If we equivocate on providing substantial answers to these questions, or if the answers are
inadequate, then what human right do we possess to pass judgment on and determine the
lives of our children from flimsy pupil evaluation techniques?

Recommendations 

In my opinion certain steps need to be taken immediately for rectifying and containing
the injustices of testing. The following is a partial list of recommendations to improve
Statewide responsibility in the area.

1. Until such time that the State can demonstrate unequivocally that their statewide
testing and evaluation program is fair to all groups, and that every student has had
an equal exposure to quality school environments before evaluation, I reaffirm
the resolution of the Association of Black Psychologists and recently the NAACP
and the National Education Association (see Journal of School Psychology, 1973)
which declares a moratorium on testing. This should not be construed to mean that the
search for alternative means of assessment be abandoned, but that the current
testing practices be halted.

2. The State should establish a task force for the development of an Office of
Consumer Affairs in Testing and Student Evaluation. The responsibilities of
this Office would include the following:

a. consumer advocacy regarding the use and misuse of tests as well as
advising parents of their legal rights in the testing of their children.

b. advocating for the adoption of a "Truth-in-Testing" law in the State
Legislature.

c. a test review board to scrutinize and systematically monitor test utility,
development, policies and practices of all agencies employing assessment
procedures.

d. the development of informational advisory centers plus layman documents
on testing for parents of school children.

e. the development of comprehensive Statewide Standards on Testing which
reflect the interests of minority groups on the testing issue.

3. The State should establish a Research and Development Office which will have
the latitude to study empirical questions of teacher and pupil performance, in
contrast to an Office which functions as a statistics mill for legislative
accountability.

As further food for thought, the remainder of my comments on the testing dilemma
for minorities come from the working supposition that the systematic exclusion of minorities
at all phases, from test development to utilization, presents a major concern in the evaluation
of minority intellectual development and subsequent achievements in life.

I. Statement of the Issues

Tests constitute a major dilemma for minorities (see Williams, 1971; Jones, 1972;
Flaugher, 1973). Minorities often have performances which fall below the norm. Because of
this performance many of their lives become subjected to below-standard opportunities.
In schools they are tracked and in jobs they are screened out. To relegate an
individual to a less than opportune position is to deny his human rights. However,
momentous decisions of this kind are made daily for minority populations based on the frailty
of standardized instruments. The reliance on test information in our society, and throughout
the world, has achieved the stature of an institutionalized practice. The demands of a
highly industrialized and technocratic society require expediency and this is what tests
offer: expediency. But what price is paid for the rush toward evaluative decisions of
this nature? It is usually the sacrificing of individuality plus the consideration of the
human capacity for resiliency, adaptation and range of capabilities. I remember distinctly
a class with Leona Tyler, a distinguished Counseling and Measurement Psychologist and
former President of the American Psychological Association, in which she postulated a
theory of possibilities to characterize human potential and risks in development. The
essential point of this theory was that an individual possesses a range of potential abilities.
Their development and ultimate manifestation were in part determined by access to the proper
experiences of life. Maximum development of a potential was contingent upon the nature
of specific exposures experienced by the individual. The significance of this theory for a
discussion of the testing issue is the social neglect of the individual's range of possibilities
to perform, brought about by the specificity of test demands and the narrow-mindedness which
ensues from test interpretations and standards.

This inevitably leads to the issue of test abuse. For if tests are designed to
differentiate then someone must fall in the relative position of lower status. For minorities
the lower status tends to be the perennial position. The question is why, and what are the
consequences. The arguments are abundant but they continually return to the issue of
culturally biased tests. One can spend an eternity debating the fine points of this position.
As muddled as the issue is, there is some legitimacy to the argument that the diversity of
cultural backgrounds in a sample population is given little consideration in test development.
This is most reflected in the content of tests, their instructions and procedures of administration.
Likewise, statistical procedures cannot be the principal accounting tool in the normalization
process. I feel the major point to be understood in the "culture bias issues" is the fact that
we know very little about how cultural ethos and ethnocentrism define the ecology of
learning and development for groups of people. It is this concern which lends credence to
the claim of cultural bias in tests.

Cultural fairness of tests is another rationale to explain the testing dilemma for
minorities (Flaugher, 1973). Eells, Davis, Havighurst, et al. (1950) attempted to examine
cultural factors in test item responses and one of their conclusions is that our focus should
be on how "fair" a test is to given populations. Such a focus does not remove the injustices
tests heap upon the less fortunate minorities but it begins to shift our sights from the
cultural validity of the test content to the predictive validity of an instrument. With
greater representation of diverse populations in the standardization process and improved
specification of criterion variables, prediction is refined. But this does not prevent the stigma
of categorization from poor test performance. Although our tests may become more "fair"
through sample representation, the sorting of group performances for prediction does not
resolve the testing dilemma. Value judgments still dominate the determination of criterion
variables.

It is important to remember the purpose of tests in this technocratic society,
i.e., screening. This not only includes the tests as screening tools but the perceived
utility of the instrument by the test users. It is at this juncture that the establishment
of cut-off scores becomes of paramount importance. If a selective process were not
involved, the purpose of testing would be negligible. However, the primary role of testing
is to sort into the "haves and the have-nots." Testing is a very elitist process in its practical
objectives. We take the "cream off the top" in order to fulfill our institutional
requirements. The chance of having failures after sorting is reduced accordingly. For an
instrument to achieve such a selective capacity there are a number of factors involved
in its construction, administration and utilization. Within these three areas is the heart
of the testing dilemma for minorities. To what extent are minority interests considered
during these phases? Very often, little. The purpose for a test is the primary instigator in
its construction. It is at this point that the foundation for the development of a tool is
articulated between the agency or group in need and the test developers. Within this
sanctum the practical utility is delineated as well as measurement objectives. Needless
to say, the adage "a test is no better than its developers" fits most appropriately at this
point. On countless occasions when minority group bitterness over testing was at its
ostensible zenith, a common complaint was the exclusion of minority input during test
development or even revisions. Consequently the dilemma of tests for minorities can be
considered as beginning here.

But the dilemma becomes more confounded with the administration of tests. There
are serious problems in this area also. The way a test administrator presents the task can be
a formidable encounter. Testing is already shrouded with the stigma of sorting. The
individual knows performance weighs heavily in the direction his future life will take.
Apprehension is frequently the psychological state that individuals bring to the testing
situation. It is a known fact that test anxiety can have a deleterious effect on performance
(Sarason, 1960). Very often, however, test administrators in their presumed desire to
conform to procedures disregard the elements of apprehension which frequently loom in
the testing room. The atmosphere established by the administrator is too often impersonal.
The measurement legitimacy of this posture is not so much questioned as is the utility of
this affectation for engendering maximum performance from test takers. Within our
every-day learning environments, both formal and informal, where performance for survival
really counts, the demands for production are not so artificial. Consequently the testing
environment turns off production rather than enhancing it. I do not mean to suggest
that this phenomenon is limited to minority populations but the experience does have its
peculiar reception by minority persons. However, another aspect of test administration
has a real differential impact on minority populations and that is the delivery of instructions
by the test administrator. This is most poignantly represented by a group of students I work
with in a high school equivalency program. Teachers in this program were distressed at the
results of student scores on the equivalency examination. The number of failures was in
marked contrast to what performance on workbook tasks would have predicted. Pressed to
determine the cause of such test performance, they eventually learned from careful
interviewing and backtracking that students did not understand the test administrator adequately.
The test administrator's affect and diction during presentation of instructions were inhibitive.
I have no fears of reinforcing racist stereotypes when I say that affect and diction are an
integral part of black people's communication processes as well as those of many other
minorities of color, be they Hispanic or Native American. Given this fact there is a
serious discontinuity between the formal and sterile test situation and that of the every-day
communication arena. Some may say this is a debatable issue or irrelevant to the situational
requirements; but the main point to consider is that test administration objectives may not be
conducive to maximizing performance (Hertzig, Birch, and Thomas, 1969).

One other issue within the domain of test administration is the language of instructions,
and the permissibility to deviate from them to facilitate comprehension. Much to the surprise
of my teacher friends in the equivalency program, their flexibility in providing numerous
interpretations of instructions in the work manual until student comprehension was
indisputable contributed greatly to successful performance. In this instance the language as well
as style of presenting test instructions often becomes an obstacle to minority students, and thus
the testing dilemma is compounded.

Utilization of tests is another area of central concern in the testing dilemma for
minorities. This incorporates not only test interpretation and the ensuing consequences but
also the process of test selection. The tests we select almost mandate their utilization objectives.
This is, in part, due to the specific scope and assessment purpose of the instrument. Invariably
the tests we select are for predictive purposes. Our interests are in determining the probability
of an individual's success within our institution. Consequently, a dependency on the norms
emerges for the judgment process (Cleary, 1968). The crucial issue in normative data is its
representation of the population, its predictive value and the line of demarcation for cut-off
scores. Thorndike (1971), Darlington (1971), Cole (1972) and other measurement specialists
discussed the problems associated with cut-off scores and culture fairness of tests. The
inclusion and exclusion of individuals by the cut-off score does not consider adequately the
slope of the regression curve for individual groups in relationship to the test and criterion
variables. Consequently absolute dependency on the norms for selection decisions is less
efficacious than an inclusion of a number of criteria, both quantitative and qualitative.
There is still considerable over- and under-prediction from our assessment instruments. This
is, in part, due to the fact that the success of individuals below the cut-off scores is still
probabilistic. These are important technical issues that test users should be sensitive to
but are often unaware of when considering the utility of a selection instrument.
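
The point about group regression slopes and cut-off scores can be made concrete with a small numerical sketch. The Python below is only an illustration under stated assumptions: the score pairs, the cut-off of 50 and the success level of 55 are invented for the example and do not come from any study cited here, and the helper names are hypothetical. It fits a separate least-squares line for each group, fits one pooled line, and then reports how far the pooled line over- or under-predicts each group's criterion performance on average, plus how many people fall below the cut-off yet would have met the success level.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_prediction_error(xs, ys, slope, intercept):
    """Average of (actual - predicted); a positive value means under-prediction."""
    return mean(y - (intercept + slope * x) for x, y in zip(xs, ys))


def excluded_but_successful(pairs, cutoff, success_level):
    """Count people below the test cut-off whose criterion score meets the success level."""
    return sum(1 for x, y in pairs if x < cutoff and y >= success_level)


# Hypothetical (test score, criterion score) pairs, invented for illustration only.
group_a = [(40, 50), (50, 60), (60, 70), (70, 80)]
group_b = [(30, 52), (40, 57), (50, 62), (60, 67)]

xa, ya = zip(*group_a)
xb, yb = zip(*group_b)

slope_a, icpt_a = fit_line(xa, ya)            # line fitted to group A alone
slope_b, icpt_b = fit_line(xb, yb)            # line fitted to group B alone
pooled_slope, pooled_icpt = fit_line(xa + xb, ya + yb)  # one line for everyone

err_a = mean_prediction_error(xa, ya, pooled_slope, pooled_icpt)
err_b = mean_prediction_error(xb, yb, pooled_slope, pooled_icpt)

missed_a = excluded_but_successful(group_a, cutoff=50, success_level=55)
missed_b = excluded_but_successful(group_b, cutoff=50, success_level=55)

print(f"slopes: A={slope_a:.2f}, B={slope_b:.2f}, pooled={pooled_slope:.2f}")
print(f"mean error under pooled line: A={err_a:+.2f}, B={err_b:+.2f}")
print(f"below cut-off but successful: A={missed_a}, B={missed_b}")
```

In this invented data the two groups have different slopes, so a single pooled line under-predicts one group and over-predicts the other, and a single cut-off excludes a person in group B who would have met the success level. Real data would of course require far larger samples and an explicit, defensible criterion measure.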

A growing complaint among the many community and professional groups is the
incongruity between the task of tests and on-the-job task requirements, i.e., the specification
of the criterion variable. In many agencies the tests employed have little relationship to
the job. Personnel and admission offices have operated on the premise of the correlation
between standardized instruments and the probability of job or school success. It is not
often considered that poor performers on the test tasks may have the requisite capabilities
for on-the-job performance. This, however, would require other assessment techniques.
Since time and money are factors in developing any diversified and unique or specific
assessment approach, it is convenient to stick with instruments with correlational value.
The utility of this procedure may be expedient but it does raise the issue of fairness.
Many potentially successful persons are being excluded by this technique. During the
days of the guild system there was merit in the apprenticeship program because it at least
allowed the individual to succeed or fail at the job he must master. To a degree the open
admissions policy in some state college systems is offering the same kind of opportunity. It
may reflect a sink or swim philosophy but at least the chance to plunge into the stream of
activity is offered.

Assumptions in Testing

The validity of a test is perhaps one of the most heatedly discussed issues throughout
lay and professional circles. It certainly is not an issue that can be easily resolved. On
the other hand those of us who have been exposed to measurement courses have frequently
been informed of certain objectives to obtain in test development regarding validity. A test
is considered valid if it meets one or more criteria. Several kinds of validity concepts must
be considered in the process of test development, i.e., construct validity, content
validity, predictive validity, face validity, and concurrent validity (Cronbach, 1970).

It is the content validity area which forms the substance of the culture bias issue.
The cultural origin of task content does affect cognitive performance (Franklin, 1974). For
the resisters of testing, however, the major validity argument hinges on the exclusion of
capable persons by tests regardless of the technical integrity of the instrument. Since
opportunity selection has greater social than technical implications, the latter argument of
test resisters has considerable credence.

Conclusion

In just this brief review of some issues in the testing dilemma for minorities it is
apparent that the problem is immense. It is greatly obfuscated by the social concerns of
the lay community and the questioned professional integrity of measurement specialists. In
spite of this conflict it is most important that evaluative agencies recognize that tests and
their ensuing social judgments are instruments of racism by virtue of minority exclusion in
all phases of test utilization. Moreover, since minorities have limited access to the opportunity
(mainstream) structures of this society, much less policy making positions, it is obvious that
decisions on criterion variables (job or education) have negligible minority input. So our lives
are entrusted to the "goodwill" of the establishers of the criteria. There is nothing in
the history of this country which makes me believe that benevolence has won over racism
in the interests of minorities. Consequently, since racism has been an integral characteristic
of the power brokers in this country, and the testing industry caters to the power brokers,
there is no reason to assume that testing has the best interests of minorities at heart.